Discourse indicators for content selection in summarization
نویسندگان
چکیده
We present analyses aimed at eliciting which specific aspects of discourse provide the strongest indication for text importance. In the context of content selection for single document summarization of news, we examine the benefits of both the graph structure of text provided by discourse relations and the semantic sense of these relations. We find that structure information is the most robust indicator of importance. Semantic sense only provides constraints on content selection but is not indicative of important content by itself. However, sense features complement structure information and lead to improved performance. Further, both types of discourse information prove complementary to non-discourse features. While our results establish the usefulness of discourse features, we also find that lexical overlap provides a simple and cheap alternative to discourse for computing text structure with comparable performance for the task of content selection.
منابع مشابه
Discourse Indicators for Content Selection in Summaization
We present analyses aimed at eliciting which specific aspects of discourse provide the strongest indication for text importance. In the context of content selection for single document summarization of news, we examine the benefits of both the graph structure of text provided by discourse relations and the semantic sense of these relations. We find that structure information is the most robust ...
متن کاملTowards Coherent Multi-Document Summarization
This paper presents G-FLOW, a novel system for coherent extractive multi-document summarization (MDS).1 Where previous work on MDS considered sentence selection and ordering separately, G-FLOW introduces a joint model for selection and ordering that balances coherence and salience. G-FLOW’s core representation is a graph that approximates the discourse relations across sentences based on indica...
متن کاملContextual Salience in Query-based Summarization
Discourse theories claim that text gets meaning in context. Most summarization systems do not take advantage of this. They assess the relevance of each passage individually rather than modeling the way context affects the relevance of passages. This paper presents a framework for graph-based summarization in order to model relations in text, so that the passages can be viewed in a broader conte...
متن کاملMining Discourse Markers For Chinese Textual Summarization
Discourse markers foreshadow the message thrust of texts and saliently guide their rhetorical structure which are important for content filtering and text abstraction. This paper reports on efforts to automatically identify and classify discourse markers in Chinese texts using heuristic-based and corpus-based data-mining methods, as an integral part of automatic text summarization via rhetorica...
متن کاملJoint semantic discourse models for automatic multi-document summarization
Automatic multi-document summarization aims at selecting the essential content of related documents and presenting it in a summary. In this paper, we propose some methods for automatic summarization based on Rhetorical Structure Theory and Cross-document Structure Theory. They are chosen in order to properly address the relevance of information, multidocument phenomena and subtopical distributi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010